Acoustic-to-articulatory Inversion Using Dynamical and Phonological Constraints

Authors

  • Sorin Dusan
  • Li Deng
Abstract

A well-known difficulty in using the articulatory representation for applications in speech coding, synthesis and recognition is the poor accuracy with which articulatory parameters are estimated from the acoustic speech signal. The difficulty is especially serious for most classes of consonantal sounds. This paper presents a statistical method of estimating articulatory trajectories from the speech signal, based on training databases of articulatory-acoustic parameters obtained from continuous speech utterances. The estimation of articulatory trajectories uses the extended Kalman filtering (EKF) technique and is based on new linguistic constraints imposed on acoustic-to-articulatory inversion. These new constraints are mainly implemented by dividing the whole articulatory-acoustic function into a number of phonological sub-functions, each corresponding to a unit of speech defined as the pattern of the continuous transition between two consecutive phonemes. Each articulatory-acoustic sub-function is part of the state-space model that represents the corresponding phonological unit of speech. A method of segmenting the speech signal and recognizing the phonological units was developed, based on likelihood computation from Kalman filtering with different models. The final estimate of the articulatory trajectories is obtained from a Kalman smoother using the parameters of the recognized models. Estimation results are compared with articulographic and X-ray speech data. Average RMS errors of about 2 mm were obtained between estimated and actual articulatory trajectories.
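The idea of choosing among unit-specific state-space models by the likelihood that Kalman filtering assigns to the observed acoustics can be illustrated with a minimal scalar sketch. This is not the authors' implementation: the single articulatory state, the target-directed dynamics, the observation function `tanh(x) + 0.1*x`, the two models `unit_A` and `unit_B`, and all noise values are hypothetical, and the Kalman smoothing pass mentioned in the abstract is omitted.

```python
import math
import random


def ekf_step(x, p, y, model):
    """One scalar EKF predict/update step; returns (x_new, p_new, log_lik)."""
    a, target, q, r = model["a"], model["target"], model["q"], model["r"]
    # Predict with assumed target-directed dynamics: x_k = a*x_{k-1} + (1-a)*target
    x_pred = a * x + (1.0 - a) * target
    p_pred = a * p * a + q
    # Linearize the assumed observation function h(x) = tanh(x) + 0.1*x at x_pred
    h_pred = math.tanh(x_pred) + 0.1 * x_pred
    h_jac = 1.0 - math.tanh(x_pred) ** 2 + 0.1
    # Innovation and its variance
    innov = y - h_pred
    s = h_jac * p_pred * h_jac + r
    # Measurement update
    k = p_pred * h_jac / s
    x_new = x_pred + k * innov
    p_new = (1.0 - k * h_jac) * p_pred
    # Gaussian log-likelihood of the innovation
    log_lik = -0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
    return x_new, p_new, log_lik


def score_segment(ys, model, x0=0.0, p0=0.01):
    """Filter a segment with one model; return (total log-likelihood, state estimates)."""
    x, p, total, states = x0, p0, 0.0, []
    for y in ys:
        x, p, ll = ekf_step(x, p, y, model)
        total += ll
        states.append(x)
    return total, states


# Two hypothetical unit models that differ only in their articulatory target.
models = {
    "unit_A": {"a": 0.9, "target": 1.0, "q": 1e-4, "r": 2.5e-3},
    "unit_B": {"a": 0.9, "target": -1.0, "q": 1e-4, "r": 2.5e-3},
}

# Simulate a short acoustic segment from unit_A (synthetic data, fixed seed).
rng = random.Random(0)
m = models["unit_A"]
x_true, xs_true, ys = 0.0, [], []
for _ in range(40):
    x_true = m["a"] * x_true + (1.0 - m["a"]) * m["target"] + rng.gauss(0.0, math.sqrt(m["q"]))
    xs_true.append(x_true)
    ys.append(math.tanh(x_true) + 0.1 * x_true + rng.gauss(0.0, math.sqrt(m["r"])))

# Recognize the unit as the model with the highest filtering likelihood,
# then take that model's filtered states as the trajectory estimate.
scores = {name: score_segment(ys, mod)[0] for name, mod in models.items()}
best = max(scores, key=scores.get)
_, estimate = score_segment(ys, models[best])
rms = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, xs_true)) / len(ys))
print(best, rms)
```

In this toy setting the model whose target matches the simulated data receives the higher summed innovation log-likelihood. A fuller treatment would use vector-valued states and observations and would follow the filter with a smoothing pass.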


Similar resources

Methods for Integrating Phonetic and Phonological Knowledge in Speech Inversion

Exploiting the information about the vocal tract shape that produced the speech has been appealing to speech researchers and scientists for a long period of time. Experimental studies that included the articulatory information from physiological measurements supported the idea that this information could be useful in a number of areas of speech science and technology. However, the estimation of...


Mapping between acoustic and articulatory gestures

We propose a method for Acoustic-to-Articulatory Inversion based on acoustic and articulatory ‘gestures’. A definition for these gestures along with a method to segment the measured articulatory trajectories and the acoustic waveform into gestures is suggested. The gestures are parameterized by 2D DCT and 2D-cepstral coefficients respectively. The Acoustic-to-Articulatory Inversion is performed...


Introduction of constraints in an acoustic-to-articulatory inversion method based on a hypercubic articulatory table

Our acoustic to articulatory inversion method exploits an original articulatory table structured in the form of a hypercube hierarchy. The articulatory space is decomposed into regions where the articulatory-to-acoustic mapping is linear. Each region is represented by a hypercube. The inversion procedure retrieves articulatory vectors corresponding to an acoustic entry from the hypercube table....


A Rough Guide to the Acoustic-to-articulatory Inversion of Speech

This article reviews a specific speech research area called acoustic-to-articulatory inversion of speech, or speech inversion, which refers to the problem of mapping the acoustic speech signal onto a space describing the configuration of the human vocal tract that actually produced this signal. This space may be modeled in a variety of ways, such as with trajectories of the movement of the artic...


Acoustic-to-articulatory inversion using a speaker-normalized HMM-based speech production model

Acoustic-to-articulatory inverse mapping is a difficult problem because of its non-linear and one-to-many characteristics. We have previously developed a speech inversion method using a hidden Markov model (HMM)-based speech production model which takes into account the phoneme-specific dynamic constraints of articulatory parameters. We found that the constraint significantly decreases the estima...





Publication date: 2007